4 research outputs found

    Fosmid-based whole genome haplotyping of a HapMap trio child: evaluation of Single Individual Haplotyping techniques

    Determining the underlying haplotypes of individual human genomes is an essential, but currently difficult, step toward a complete understanding of genome function. Fosmid pool-based next-generation sequencing allows genome-wide generation of 40-kb haploid DNA segments, which can be phased into contiguous molecular haplotypes computationally by Single Individual Haplotyping (SIH). Many SIH algorithms have been proposed, but the accuracy of such methods has been difficult to assess due to the lack of real benchmark data. To address this problem, we generated whole genome fosmid sequence data from a HapMap trio child, NA12878, for which reliable haplotypes have already been produced. We assembled haplotypes using eight algorithms for SIH and carried out direct comparisons of their accuracy, completeness and efficiency. Our comparisons indicate that fosmid-based haplotyping can deliver highly accurate results even at low coverage and that our SIH algorithm, ReFHap, is able to efficiently produce high-quality haplotypes. We expanded the haplotypes for NA12878 by combining the current haplotypes with our fosmid-based haplotypes, producing near-to-complete new gold-standard haplotypes containing almost 98% of heterozygous SNPs. This improvement includes notable fractions of disease-related and GWA SNPs. Integrated with other molecular biological data sets, this phase information will advance the emerging field of diploid genomics.

    Multiple haplotype-resolved genomes reveal population patterns of gene and protein diplotypes

    To fully understand human biology and link genotype to phenotype, the phase of DNA variants must be known. Here we present a comprehensive analysis of haplotype-resolved genomes to assess the nature and variation of haplotypes and their pairs, diplotypes, in European population samples. We use a set of 14 haplotype-resolved genomes generated by fosmid clone-based sequencing, complemented and expanded by up to 372 statistically resolved genomes from the 1000 Genomes Project. We find immense diversity of both haploid and diploid gene forms, up to 4.1 and 3.9 million, corresponding to 249 and 235 per gene on average. Less than 15% of autosomal genes have a predominant form. We describe a ‘common diplotypic proteome’, a set of 4,269 genes encoding two different proteins in over 30% of genomes. We show moreover an abundance of cis configurations of mutations in the 386 genomes with an average cis/trans ratio of 60:40, and distinguishable classes of cis- versus trans-abundant genes. This work identifies key features characterizing the diplotypic nature of human genomes and provides a conceptual and analytical framework, rich resources and novel hypotheses on the functional importance of diploidy.

    The Molecular Switching Mechanism at the Conserved D(E)RY Motif in Class-A GPCRs

    The disruption of ionic and H-bond interactions between the cytosolic ends of transmembrane helices TM3 and TM6 of class-A (rhodopsin-like) G protein-coupled receptors (GPCRs) is a hallmark for their activation by chemical or physical stimuli. In the bovine photoreceptor rhodopsin, this is accompanied by proton uptake at Glu134 in the class-conserved D(E)RY motif. Studies on TM3 model peptides proposed a crucial role of the lipid bilayer in linking protonation to stabilization of an active state-like conformation. However, the molecular details of this linkage could not be resolved and have been addressed in this study by molecular dynamics (MD) simulations on TM3 model peptides in a bilayer of 1,2-dioleoyl-sn-glycero-3-phosphocholine (DOPC). We show that protonation of the conserved glutamic acid alters the peptide insertion depth in the membrane and its side-chain rotamer preferences, and stabilizes the C-terminal helical structure. These factors contribute to the rise of the side-chain pKa (> 6) and to reduced polarity around the TM3 C terminus as confirmed by fluorescence spectroscopy. Helix stabilization requires the protonated carboxyl group; unexpectedly, this stabilization could not be evoked with an amide in MD simulations. Additionally, time-resolved Fourier transform infrared (FTIR) spectroscopy of TM3 model peptides revealed different kinetics for lipid ester carbonyl hydration, suggesting that the carboxyl is linked to more extended H-bond clusters than an amide. Remarkably, this was seen as well in DOPC-reconstituted Glu134- and Gln134-containing bovine opsin mutants and demonstrates that the D(E)RY motif is a hydrated microdomain. The function of the D(E)RY motif as a proton switch is suggested to be based on the reorganization of the H-bond network at the membrane interface.

    A comprehensively molecular haplotype-resolved genome of a European individual

    Independent determination of both haplotype sequences of an individual genome is essential to relate genetic variation to genome function, phenotype, and disease. To address the importance of phase, we have generated the most complete haplotype-resolved genome to date, “Max Planck One” (MP1), by fosmid pool-based next generation sequencing. Virtually all SNPs (>99%) and 80,000 indels were phased into haploid sequences of up to 6.3 Mb (N50 ∼1 Mb). The completeness of phasing allowed determination of the concrete molecular haplotype pairs for the vast majority of genes (81%) including potential regulatory sequences, of which >90% were found to be constituted by two different molecular forms. A subset of 159 genes with potentially severe mutations in either cis or trans configurations exemplified in particular the role of phase for gene function, disease, and clinical interpretation of personal genomes (e.g., BRCA1). Extended genomic regions harboring manifold combinations of physically and/or functionally related genes and regulatory elements were resolved into their underlying “haploid landscapes,” which may define the functional genome. Moreover, the majority of genes and functional sequences were found to contain individual or rare SNPs, which cannot be phased from population data alone, emphasizing the importance of molecular phasing for characterizing a genome in its molecular individuality. Our work provides the foundation to understand that the distinction of molecular haplotypes is essential to resolve the (inherently individual) biology of genes, genomes, and disease, establishing a reference point for “phase-sensitive” personal genomics. MP1's annotated haploid genomes are available as a public resource.